Tenogenic induction of human adipose-derived stem cells by soluble tendon extracellular matrix: composition and transcriptomic analyses

Background Tendon healing is clinically challenging largely due to its inferior regenerative capacity. We have previously prepared a soluble, DNA-free, urea-extracted bovine tendon-derived extracellular matrix (tECM) that exhibits strong pro-tenogenic bioactivity on human adipose-derived stem cells (hASCs). In this study, we aimed to elucidate the mechanism of tECM bioactivity via characterization of tECM protein composition and comparison of transcriptomic profiles of hASC cultures treated with tECM versus collagen type I (Col1) as a control ECM component. Methods The protein composition of tECM was characterized by SDS-PAGE, hydroxyproline assay, and proteomics analysis. To investigate tECM pro-tenogenic bioactivity and mechanism of action, differentiation of tECM-treated hASC cultures was compared to serum control medium or Col1-treated groups, as assessed via immunofluorescence for tenogenic markers and RNA Sequencing (RNA-Seq). Results Urea-extracted tECM yielded consistent protein composition, including collagens (20% w/w) and at least 17 non-collagenous proteins (< 100 kDa) based on MS analysis. Compared to current literature, tECM included key tendon ECM components that are functionally involved in tendon regeneration, as well as those that are involved in similar principal Gene Ontology (GO) functions (ECM-receptor interaction and collagen formation) and signaling pathways (ECM-receptor interaction and focal adhesion). When used as a cell culture supplement, tECM enhanced hASC proliferation and tenogenic differentiation compared to the Col1 and FBS treatment groups based on immunostaining of tenogenesis-associated markers. Furthermore, RNA-Seq analysis revealed a total of 584 genes differentially expressed among the three culture groups. Specifically, Col1-treated hASCs predominantly exhibited expression of genes and pathways related to ECM-associated processes, while tECM-treated hASCs expressed a mixture of ECM- and cell activity-associated processes, which may explain in part the enhanced proliferation and tenogenic differentiation of tECM-treated hASCs. Conclusions Our findings showed that urea-extracted tECM contained 20% w/w collagens and is significantly enriched with other non-collagenous tendon ECM components. Compared to Col1 treatment, tECM supplementation enhanced hASC proliferation and tenogenic differentiation as well as induced distinct gene expression profiles. These findings provide insights into the potential mechanism of the pro-tenogenic bioactivity of tECM and support the development of future tECM-based approaches for tendon repair. Supplementary Information The online version contains supplementary material available at 10.1186/s13287-022-03038-0.


Introduction
Tendon is a fibrous band of collagenous tissue that connects muscle to bone and functions in force transmission during musculoskeletal movement [1]. Tendon injuries and diseases carry significant morbidity and are estimated to account for 45% of the more than 32 million musculoskeletal injuries in the USA each year [2]. These injuries have a prolonged recuperation period because of the intrinsically poor natural healing response of tendon tissues due to low cellularity and low vascularity [3]. Current clinical treatments for small tendon injuries can usually restore tissue integrity, but full functionality is rarely attained [4]. On the other hand, large-to-massive tendon injuries entail high re-rupture rates and various complications can persist several years postinjury [5]. Within this context, numerous growth factors have been utilized for tendon repair, including fibroblast growth factors (FGF), platelet-derived growth factors (PDGF), insulin-like growth factors (IGF), transforming growth factor-beta (TGF-β), bone morphogenetic protein 2 (BMP2) and connective tissue growth factors (CTGF) [6,7]. In addition to these growth factors, extracellular matrix (ECM)-based approaches have also been widely studied and utilized in clinical practice and biomedical research for tendon repair [8]. The ECM is a complex three-dimensional network of interacting macromolecules that occupies the space between cells and is principally responsible for both force transmission and tissue structure maintenance [9]. The ECM is also unique in its tissue-specific bioactivity because each tissue or organ contains a unique ECM composition that contributes to tissue-specific structure and function [9,10]. Indeed, a number of ECM-based, Food and Drug Administration (FDA)-approved biomaterials, such as GraftJacket ™ (Wright Medical Group), Zimmer ® Collagen Repair Patch (Zimmer, Inc.) and TissueMend ® (Stryker), have shown promising tendon healing potential.
When developing ECM-based biomaterials for therapeutic use, a major issue is that their bioactive properties can differ due to batch-to-batch preparations. Such variability can be caused by a variety of reasons, including donor age and health, tissue source, storage conditions and decellularization methods [11]. There are many methods to extract tissue-derived ECM for clinical application, with acid-pepsin digestion being one of the classic techniques [10]. During the acid solubilization and/or pepsin digestion process, slight changes such as differences in pH or ionic concentration can lead to inconsistencies in ECM quality [12]. Additionally, tissue ECM produced by acid-pepsin solubilization, such as some of the commercial ECM scaffolds mentioned above, is composed primarily of collagens. For example, GraftJacket ™ is an acellular human dermal collagen matrix, which has been shown to provide functional support and reinforcement of tendon/ligament tissue. TissueMend ® , which is designed to reinforce the tendon during repair and facilitate tissue remodeling, is also a collagen matrixbased material derived from fetal bovine skin. Although tendon ECM is predominantly composed of collagens, which account for around 60-85% of its dry weight [13], its non-collagenous ECM components, such as proteoglycans, glycoproteins and glycoconjugates, also play important biological roles, including collagen fibril formation and tenocyte homeostasis that further contribute to overall tendon function and repair [13,14]. However, details of the organization and hierarchical locations of these non-collagenous ECM components are generally less well understood. In addition, the activities of tendon ECM-sequestered biofactors must also be taken into consideration [13]. Therefore, further work on deciphering the composition and the mechanism of action of tendon ECM will be needed to elucidate the issues of donor-todonor variability and subsequently engineer well-defined, bioactive ECM-based products.
To improve the yield of non-collagenous components during tendon ECM extraction, our laboratory has previously developed a urea-based method to prepare a soluble, DNA-free, ECM fraction from bovine tendons (tECM), which exhibited strong pro-tenogenesis effects on human adipose-derived stem cells (hASCs) [15]. In this study, to further investigate the mechanism of action of tECM bioactivity, we characterized the protein compositions of tECM by sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) analysis, hydroxyproline assay and mass spectrometry (MS)-based proteomics analysis. Subsequently, we compared the tenogenic activity of hASCs upon exposure to tECM with that in the presence of collagen type I (Col1), a major tendon ECM component, on the basis of established tenogenesis-associated markers detected by immunofluorescence staining. In addition, RNA sequencing (RNA-Seq) and bioinformatics analysis were applied to characterize the transcriptomic effects of tECM on hASCs.

Hydroxyproline assay
Collagen concentration in tECM was estimated using a chloramine-T hydroxyproline assay [10] and standardized with commercial bovine collagen solution (Advanced BioMatrix, Inc., USA).

SDS-PAGE and in-gel trypsin digestion
tECM preparations were subjected to SDS-PAGE in 8% gel as previously described [16]. Based on calibration using molecular weight markers, Coomassie Blue-stained gel bands below 100 kDa were excised and processed using the In-Gel Tryptic Digestion Kit (Thermo Fisher Scientific) for peptide extraction according to the manufacturer's protocol. Briefly, after destaining with 2 mg/ mL ammonium bicarbonate in 50% (v/v) acetonitrile (ACN) at 37 °C, gel pieces were processed for alkylation reduction using 50 mM Tris [2-carboxyethyl] phosphine (TCEP) and 500 mM iodoacetamide (IAA). The gel pieces were then ACN-dehydrated and rehydrated with 50 μL trypsin (10 ng/μL). After digestion at 30 °C overnight, the tryptic peptide solution was vacuum-dried.
Images were digitally captured using the Olympus IX83 microscope (Olympus, Japan). For each group, 3-4 samples were randomly chosen and thereafter, 2-3 images for each sample were randomly selected to quantify fluorescence intensity and coverage via NIH ImageJ as described previously [16]. Briefly, cell counts were computed by measuring the number of DAPI-positive cell nuclei. Nuclear fluorescence intensity was calculated as the mean fluorescence intensity within the cell nuclear area per field (30-40 nuclei were randomly selected per image). Fluorescence coverage was determined by measuring the percentage of fluorescently labeled area per image. Cell counts, nuclear fluorescence intensity and fluorescence coverage were statistically compared among groups using one-way ANOVA with Tukey post hoc tests.

RNA-Seq analysis
RNA-Seq was performed using Illumina's next-generation sequencing workflow [20]. Briefly, complementary DNA (cDNA) libraries from cell differentiation were obtained from hASCs cultured for 6 days, including the following groups: (1) control FBS group, basal medium; (2) Col1 group, basal medium with 10% (v/v) 1 mg/mL Col1 solution; and (3) tECM group, basal medium with 10% (v/v) 1 mg/mL tECM solution. RNA was isolated and library construction was performed as described previously [20]. Briefly, cellular RNA was isolated, reverse-transcribed into cDNA, purified using AMPure XP beads (Beckman Coulter, USA) and PCR-amplified for RNA-Seq library construction using TruePrep DNA Library Prep Kit V2 for Illumina (Vazyme, China) in accordance with the manufacturer's protocols. The average full length of the library was around 450 bp, and the purified library was stored at − 20 °C until further analysis. Before RNA sequencing analysis, cDNA library quality was assessed as previously described [21]. Samples were sequenced by the Novaseq 6000 system (Illumina, USA) using approximately 150 base-pair paired-end RNA-Seq technology with 60-90 million reads per sample. Genes with > twofold change and false discovery rate (FDR) < 0.05 were considered as differentially expressed genes (DEG). GO analysis was performed using the PANTHER classification system [22] (http:// geneo ntolo gy. org). Pathway analysis was performed by using the Kyoto Encyclopedia of Genes and Genomes (KEGG) in the Database for Annotation, Visualization and Integrated Discovery [23] (DAVID, Resources 6.8, https:// david. ncifc rf. gov). Gene set enrichment analysis (GSEA; v 4.1.0, https:// www. gsea-msigdb. org/ gsea/ index. jsp) was performed to examine the significantly enriched KEGG pathways [24] (see Additional file 1 for experimental details).

Quantitative PCR (qPCR) assay
To validate transcriptome profiles, qPCR was performed as described previously with minor modifications [16]. At the indicated time points, cellular RNA was isolated using a Monarch ® Total RNA Miniprep Kit (New England Biolabs, USA) and reverse transcribed into cDNA using LunaScript ® RT SuperMix Kit (New England Biolabs) with 100 ng RNA. qPCR was performed using Luna ® Universal qPCR Master Mix (New England Biolabs) on a QuantStudio 7 Flex Real-Time PCR system (Applied Biosystems, USA) according to the manufacturers' instructions. Relative expression of each target gene was calculated using the ΔΔCT method and normalized to glyceraldehyde 3-phosphate dehydrogenase (GAPDH) mRNA expression. All primer sequences are listed in Table 1.

Characterization of tECM protein composition
Urea extraction of tendon ECM is shown in Fig. 1A. SDS-PAGE analysis of individual batches of tECM extracts showed a consistent protein pattern, including collagen bands (α1 and α2 chains) and low to medium molecular weight proteins (< 100 kDa) (Fig. 1B). Hydroxyproline assay of the different tECM batches showed an average collagen content of 246.32 ± 106.70 μg/mg protein (mean ± SD) in the tECM solution (Fig. 1C).
Compared to the extensive studies on the structural and biological roles of collagens in tendon, the functional aspects of the non-collagenous ECM components are less defined. To further identify low to medium molecular weight proteins in tECM, proteins < 100 kDa in molecular weight from three batches of tECM were extracted from the SDS-PAGE gel and subjected to in-gel tryptic digestion. Thirty-six proteins were identified as listed in Table 2. Based on 'The Matrisome Project, ' which was used to define the putative ECM proteins in silico and proteomic approaches [17], 19 tECM proteins (< 100 kDa) were categorized as core matrisome and matrisome-associated proteins. These proteins included collagens (6%), proteoglycans (22%), glycoproteins (14%), secreted factors (3%), ECM-affiliated proteins (3%) and ECM regulators (6%) ( Fig. 2A). The top 6 ECM proteins based on identified numbers of peptides (mean of three batches) are shown in Fig. 2B. Specifically, Keratocan (playing a pivotal role in collagen fibrillogenesis) [25], prolargin (anchoring basement membranes) [26] and decorin ("decorates" collagen type I, involved in cell differentiation and collagen fibrillogenesis) were the major proteoglycans that contribute to collagen binding and matrix structure maintenance (Fig. 2B) [13,14]. The major glycoproteins identified in tECM extracts (< 100 kDa) were cartilage oligomeric matrix protein (COMP, thrombospondin 5), thrombospondin 1 (TSP 1) and TSP 4 ( Table 2). The proteins from the TSP family can bind with different ECM proteins and help with ECM synthesis [13]. STRING analyses resulted in a dense network of proteins with two highly connected clusters centered around ECM-receptor interaction and collagen formation (Fig. 2C).
Our tECM proteomics results were further compared with current literature and are shown in Table 3. Using MS-based proteomics analysis, 19 ECM proteins (< 100 kDa) were identified in our work, while 34 to 85 ECM proteins were reported in the literature. Specifically, in our study, collagens (COL1A1 and COL1A2) and the non-collagenous proteins (DCN, FMOD, COMP and PRELP) were identified in tECM (< 100 kDa) [26][27][28][29]. Functional analysis of tECM (< 100 kDa) indicated that the principal GO processes were "extracellular matrix and collagen binding, " and two significant pathways, i.e., ECM-receptor interaction and focal adhesion, were identified by STRING analysis. Key ECM proteins and functional characterization identified in tECM were similar to other proteomic studies that characterized tendon ECM composition (Table 3) [26,29].
Taken together, the tECM protein composition data showed that our urea-based tECM preparation represents an effective and highly reproducible approach of extracting bioactive ECM components from tendon. tECM contains multiple key tendon ECM proteins, as well as components involved in GO functions (ECMreceptor interaction and collagen formation) and signaling pathways (ECM-receptor interaction and focal adhesion), which were similarly reported in the literature.

Comparison between pro-tenogenic activity of tECM and Col1 on hASCs
Protein composition analysis showed that tECM contained multiple key tendon ECM components in addition  to collagens, suggesting that tECM could exhibit superior pro-tenogenic bioactivity on hASCs compared to Col1 solution alone. This hypothesis was tested using hASCs sorted by flow cytometry for mesenchymal stem cell characteristics (positive markers: CD44, CD73, CD90 and CD105; negative markers: CD11b, CD19, CD34, CD45 and HLA-DR) (data not shown). The self-renewal (CFU-F assay) and multi-differentiation potential (adipogenesis, osteogenesis and chondrogenesis) of the sorted hASCs were validated (Fig. 3A). hASCs, Passages 4-7, were used in subsequent studies [16]. Hydroxyproline assay was performed to quantify the collagen content of tECM, revealing approximately 0.2 mg/mL collagen in 1 mg/mL tECM. Therefore, tenogenic differentiation of hASCs cultured in tECM (10% v/v), Col1 (2% v/v or 10% v/v) and control basal medium (FBS) for 4 or 6 days were assessed by immunofluorescence staining of tenogenesisassociated markers (SCX, COL1 and TNC), F-actin staining and DAPI nuclear staining (Fig. 3B). DAPI-based cell counting showed significantly increased cell proliferation More pronounced immunostaining of SCX, COL1 and TNC, and F-actin staining was also observed in the tECM-treated group compared to the other three groups. Specifically, enhanced nuclear staining of SCX, a tendon-specific transcription factor, as well as dense extracellular collagen and TNC fibril network were found in tECM-treated, but not Col1-treated hASCs. Normalized, semi-quantitative analyses of immunofluorescence staining intensity at culture days 4 and 6 validated a trend of more intense staining of tenogenesis-associated markers in the tECM group compared to the FBS and Col1 (2% v/v or 10% v/v) groups (Fig. 3B). Taken together, these findings showed that treatment with tECM enhanced hASC proliferation and tenogenic differentiation compared to Col1 treatment. Thus, while Col1 is the major ECM component of tendon, other non-collagenous tendon ECM components are likely to contribute to its pro-tenogenesis bioactivity.

Transcriptomic landscape of tECM-driven pro-tenogenic differentiation of hASCs
To investigate the molecular pathways and mechanisms underlying the tECM-driven pro-tenogenic differentiation of hASCs, total cellular RNA was extracted from hASCs cultured under three different conditions (FBS, tECM (10% v/v) or Col1 (10% v/v)) for RNA-Seq analysis (Fig. 4A). RNA-Seq analysis was performed using three independent isolates (biological replicates) of each culture condition and sequenced to 61, 482, 962-88, 727, 892 raw reads per library. The similarity of expression profiles was determined by Pearson correlation coefficient (PCC) analysis, which showed a high correlation (R 2 ranging from 0.84 to 0.98) in gene expression profiles among 3 isolates as visualized using scatter plotting (Fig. 4B). Taken together, these data show a high degree of similarity among different biological replicates for each culture condition. To analyze and compare gene expression among the three culture groups (tECM versus FBS, Col1 versus FBS, tECM versus Col1), an R package-DESeq was used as described previously [30,31]. A total of 584 genes were differentially expressed (FDR < 0.05) with an absolute fold change of 2 or greater between comparisons (Fig. 4C). In particular, 365 DEGs were found in the tECM group compared to the FBS group, with 159 upregulated genes and 206 downregulated genes as shown in the volcano plot. In addition, 411 genes were differentially expressed in the tECM group compared with the Col1 group, with 135 upregulated genes and 276 downregulated genes. On the other hand, no DEG was found between Col1 and FBS groups (Fig. 4D).
Heat map and volcano plot highlighted that the tECM group exhibited distinct gene expression profiles compared with the other two groups, while the difference in gene expression between Col1 and FBS groups was minor (Fig. 4C, D). Specifically, more downregulated genes (around 70%) were shown in the tECM group compared with Col1 group (Fig. 5A). The top ten upregulated and downregulated genes as well as their related biological functions, which include regulation of cell fate, stemness, proliferation and chemotaxis as well as ECM metabolism, are shown in Fig. 5A and B.

Functional annotation and pathway analysis between the tECM and Col1 treatment groups
To explore and functionally classify differentially regulated genes (> twofold changes) between the tECM and Col1 groups, GO enrichment, KEGG pathway and GSEA analyses were performed. tECM and Col1 groups generally exhibited differential enrichment in GO terms. The upregulated genes in tECM group compared to Col1 group were found to be enriched in: (1) molecular functions, e.g., "RNA binding" and "catalytic activity"; (2) biological processes associated with cell behaviors, e.g., "cell division" and "cell cycle process"; and (3) cellular components associated with "chromosomal region, " "condensed chromosome" and "centromeric region. " Relative to tECM group, the Col1 group showed up-regulation of Table 3 Comparison of tendon ECM proteomic study methodology and results
Additionally, KEGG pathway analysis was performed to compare signaling network for genes with > twofold changes between the tECM and Col1 groups. The results showed that different signaling pathways were activated in hASCs between tECM and Col1 treatments. tECM group activated more cell activity-related pathways, such as "cell cycle progression and proteasome" pathways, which are involved in many essential biological processes, e.g., cell cycle and DNA replication. Col1 treatment activated more ECM-related signaling pathways, such as "focal adhesion and ECM-receptor interaction, " which play essential roles in cell-ECM interaction and the maintenance of cell/tissue structure and function (Fig. 7A).
GSEA is designed to detect modest but coordinated changes in the expression of functionally related gene groups [24,33]. This analysis was performed on all expressed genes, thus addressing some inherent limitations of DEG-centric analyses [33]. GSEA was performed based on KEGG pathway gene sets comparing the tECM and Col1 groups (NES > 1, P < 0.05, and FDR < 0.25) ( Table 4). Similar to the KEGG pathway results, the tECM group was significantly enriched in pathways associated with molecular activities, e.g., "cell cycle" (FDR = 0.006) and "DNA replication" (FDR = 0.011) Fig. 4 RNA-Seq transcriptome analysis. A hASCs were cultured with FBS, Col1 or tECM for 6 days and total cellular RNA was harvested for RNA-Seq analysis. B Scatter plots indicate the correlation (r 2 ) between the isolates under different culture conditions. C Heat map of transcriptome analysis for 584 differentially expressed genes (DEGs) from three culture groups. D Volcano plots of transcriptome: red spots, upregulated genes; blue spots, downregulated genes; and black spots, unchanged genes. DEGs are designated for those exhibiting > twofold changes and FDR < 0.05  (Fig. 7B). Specifically, the core enrichment genes of tECM group were found to be closely related to cell proliferation. For example, genes such as proliferating cell nuclear antigen (PCNA) and minichromosome maintenance (MCM), well-established markers for cell proliferation [34], were highly enriched in the KEGG cell cycle gene sets in the tECM group compared to the Col1 group (Fig. 7B).  Taken together, the results from the GO enrichment, KEGG pathway and GSEA analyses indicated that Col1treated hASCs predominantly exhibited ECM-associated processes, while tECM-treated hASCs expressed a mixture of ECM-and proliferation-associated response. These findings provide partial explanation of the enhanced proliferation and pro-tenogenesis effects of tECM treatment on hASCs.

Assessment of tenogenesis-associated genes between the tECM and Col1 treatment groups
To further assess the molecular mechanisms of the protenogenesis activity of tECM on hASCs, gene expression between tECM and Col1 groups was directly compared in terms of proliferation and tenogenesis-associated genes.
For expression of tenogenesis-associated genes, our analysis focused on tenogenic transcription factors, ECM and growth factor signaling molecules (Fig. 8B, C). For transcription factors, SCX, mohawk (MKX) and early growth response transcription factors (EGR1 and EGR2) play critical roles in tendon lineage differentiation and are required for the regulation of collagen fibrils [37,38]. Our data showed that the tECM group exhibited higher expression of SCX and MKX, but lower expression of EGR1 and EGR2 compared with Col1 treatment group. For tendon ECM-related genes, in addition to collagens, small leucine-rich proteoglycans (SLRPs), members of the leucine-rich repeat protein family, members of the TSPs family, matrix metalloproteinases (MMPs) and tissue inhibitor of metalloproteinases (TIMPs) have been shown to be important regulators for ECM synthesis and remodeling [14]. In our transcriptome dataset, the tECM group showed higher expression of collagen type I alpha 1 chain (COL1A1), TNC, biglycan (BGN), periostin (POSTN) and thrombospondins 4 (THBS4), whereas the Col1 group showed higher expression in decorin (DCN), prolargin (PRELP), fibromodulin (FMOD), fibronectin (FN1) and COMP. Importantly, most MMPs were upregulated and TIMPs were downregulated in tECM group compared to Col1 group, suggesting active ECM remodeling (Fig. 8B). For growth factors, the tECM group showed downregulated expression of some well-characterized tenogenic signaling growth factors, such as FGF, PDGF and IGF, with upregulated expression of TGF-β receptor 1 (TGFBR1) and BMP2 (Fig. 8C). Interestingly, the expression of tenomodulin (TNMD), another tendonspecific membrane glycoprotein, could not be detected in either tECM or Col1 treated groups. Additionally, in order to validate the gene expression profiles obtained from RNA-seq analysis, the expression of the top 10 upregulated genes, as well as selected proliferation and tenogenic markers, was further assayed by qPCR. Comparative qPCR analysis showed that, compared to Col1 (10% v/v) group, tECM-treated hASCs exhibited significantly increased expression of genes, including most of the top 10 upregulated genes identified in RNA-seq, proliferation gene MKI67 and tenogenesisassociated genes (SCX and BMP2) at days 6. In comparison, no significant difference was observed in expression of these genes between the FBS and Col1 (10% v/v) groups (Fig. 9A, B). These qPCR results thus validated the expression profile obtained using RNA-seq.
Taken together, these results showed that tECM treatment induced in hASCs a gene expression profile distinct from that in the Col1 treated group. In particular, the tECM group showed higher expression of cell cycle-and tenogenesis-associated genes, likely contributing to the augmented cell proliferation and the pro-tenogenic effect of tECM on hASCs.

Discussion
We have previously developed a urea-based, non-proteolytic method to extract bovine tendon ECM (tECM), which exhibited strong proliferation and tenogenic differentiation effects on hASCs [15,16] and has potential application for tendon repair and regeneration. In this study, we have further characterized the protein composition of tECM, as well as its pro-tenogenic bioactivity and the corresponding transcriptome profile of hASCs cultivated in tECM, in comparison with those exposed to Col1. The results showed that: (1) the ureaextracted tECM contained around both collagen (20% w/w) and other non-collagenous tendon ECM components. Compared to the current MS-based proteomic studies on tendon ECM composition, our urea-extracted tECM contained multiple key tendon ECM proteins, as well as those that are involved in similar GO functions (ECM-receptor interaction and collagen formation) and signaling pathways (ECM-receptor interaction and focal adhesion); (2) compared to Col1 treatment, tECM supplementation enhanced hASC proliferation and promoted tenogenic differentiation as well as induced distinct gene expression profiles.
Optimal ECM extraction for biomedical applications needs to meet two key criteria, i.e., high efficiency of extraction and robust maintenance of the bioactivity of the ECM components, which is largely dependent on extraction protocols [39]. Acid-pepsin digestion has been the conventional method to extract tissue ECM for clinical applications [10]. While acid-pepsin is mostly efficient for the isolation of collagen, as most proteins are susceptible to pepsin-mediated proteolytic digestion, except the triple helical collagens; thus, as expected and reported by us and others [10,40], pepsin solubilized ECM products contained mostly collagens and other structural ECM proteins, and few non-collagenous components [10,40]. Moreover, pepsin digestion can alter the bioactivity of essential bioactive molecules, e.g., the potency of growth factors can be significantly reduced upon pepsin treatment [41][42][43]. In this work, we have chosen urea as a reagent to extract tendon ECM fractions, since urea denatures via disruption of hydrogen bonding that mediates lipid-lipid, lipid-protein and protein-protein interactions to solubilize tissue ECM in a manner that allows renaturation [44]. In addition, Naba et al. revealed that the abundance of fibrillar collagens could lower the extraction efficacy of other bioactive ECM-associated components. Thus, using urea as an extraction reagent can help increase the relative content of non-collagenous components since crosslinked fibrillar collagen remains largely insoluble in urea [39]. Furthermore, other studies showed that urea-extractable collagen resulted in different fibroblast behavior, such as higher cell motility, suggesting the suitability of urea-extracted ECM for biotechnological applications and tissue engineering [45].
To characterize urea-extracted tendon ECM, we investigated tECM protein patterns via SDS-PAGE, collagen content via hydroxyproline assay and protein composition via MS-based analysis and comparison with the current literature. Our results showed that urea-extracted tECM yielded consistent protein patterns (as detected by SDS-PAGE), with collagens (20% w/w) and at least 17 non-collagenous proteins (< 100 kDa) based on MS analysis. These proteins include a number of SLRPs, such as keratocan, fibromodulin and decorin, which are known to be involved in collagen fibril maturation and collagen fibrillogenesis [13,14,25]. As for glycoproteins, the top identified components are those of the TSP protein family, which contribute to regulating diverse cellular processes such as inflammation, cell migration and cell Fig. 9 qPCR validation of RNA-seq results. A qPCR analysis of expression of genes identified by RNA-seq experiments to be positively differentially expressed in the tECM group, including the top 10 upregulated genes as well as proliferation-related and tenogenic-associated genes. B Comparison of gene expression levels between RNA-seq and qPCR (n = 5 isolates; *, P < .05; **, P < .01; ***, P < .001) proliferation [13]. Other tendon ECM components previously identified in various MS studies, including decorin, biglycan, lumican and COMP, are also included in our tECM preparation [29,46]. Additionally, functional analysis of tECM (< 100 kDa) using MS-based proteomics indicated similar principal GO processes and signaling pathways compared to the existing literature (Table 3) [26,29]. Thus, our work suggests that our urea-based tECM preparation represents an efficient and highly reproducible approach of extracting these bioactive ECM components from tendon.
Meanwhile, we also noted that we had relatively low protein coverage (n = 36 proteins) identified by proteomics analysis, relative to the data reported for other tendon extracts (e.g., n = 215 [29] and n = 92 [27]). For example, Kharaz et al. [29] used GuHCL for tendon protein extraction (long digital extensor tendons) and identified around 215 proteins including 85 (around 40%) ECM proteins. Thorpe et al. used 0.1% Rapigest (a surfactant which offers a simple detergent-based extraction of tendon tissue) to extract proteins from the superficial digital flexor tendon and identified around 92 proteins with 34 (around 37%) ECM proteins [27,47]. The discrepancy in comparison might be derived from several factors: (1) Our current study only focused on tECM proteins with molecular weight < 100 kDa, using an in-gel digestion protein extraction kit. This will eliminate ECM proteins with molecular weights > 100 kDa, such as collagens, fibronectin, fibrillin and other large proteoglycans (versican and aggrecan). (2) Compared to urea, which is a relatively mild, non-chaotropic reagent for ECM extraction, the use of GuHCL, a strong chaotropic denaturing reagent, could extract a larger number of intracellular and ECM proteins, while RapiGest extraction could result in an increased amount of identified collagens [47]. (3) Ingel trypsin digestion was used for protein sample preparation in our study. Sample loss was often more severe for gel-based method because extraction of peptides from a gel was inherently less efficient [48]. However, it is noteworthy that although GuHCL or RapiGest may be more efficient and extensive in protein extraction, urea is a milder denaturant and is thus expected to better maintain ECM bioactivity, which is important for the use of the extracted ECM for biomedical applications.
To gain insight into the mechanism of tECM bioactivity on hASCs, transcriptome expression profiles of hASCs cultivated in tECM or Col1 containing medium were compared by RNA-Seq and bioinformatics analysis. As shown in Fig. 4, tECM treatment triggered distinct hASC gene expression profiles compared to those of FBS and Col1 treatments, whereas gene expression between Col1 and FBS groups was relatively similar. GO enrichment and KEGG pathway analyses indicated different molecular mechanisms that the tECM group regulated cell proliferation, cell cycle and DNA replication pathways, while the Col1 group enhanced extracellular structure organization. Sun et al. performed single-cell RNA-Seq on human primary Wharton's Jelly-derived MSCs (WJMSC) for studying functional characteristics associated with cell proliferation, development and inflammation response. Their work showed that upregulated genes in the subpopulations of WJMSCs, which possessed a higher proliferative capacity, were significantly enriched in the DNA replication pathway and cell cycle process [49]. Chen et al. characterized the tendon stem/progenitor cells (TSPC) from postnatal rat Achilles tendon tissue at different stages of development by microarray analysis. Their data suggested that TSPCs-7d (TSPCs isolated at postnatal day 7) had significantly higher proliferation ability, characterized by upregulated genes enriched in the GO terms related to mitosis, cell division, cell cycle and DNA polymerase-related regulation [50]. Therefore, the identification of cell cycle process and signaling pathway via DEGs, GO and KEGG pathway analyses upon tECM treatment suggests that these pathways are involved in the bioactivity of tECM on hASC behaviors, including proliferation and lineage-specific differentiation.
Proliferation and tenogenesis-associated gene expression, including tenogenic transcription factors, tendon ECM and tenogenic growth factor signaling molecules, were compared between tECM and Col1 groups. Although expression of SCX, MKX, COL1A1, TNC and BGN was upregulated in the tECM group, Col1 treatment also induced higher expression of some tenogenesis-associated markers, such as EGR1, DCN, PRELP, FMOD and COMP, which have all been shown to be involved in tenogenic regulation [51]. Interestingly, quite a number of MMPs were upregulated and their inhibitors TIMPs were downregulated in the tECM group compared with Col1 group, indicating high matrix remodeling activity [15]. Unexpectedly, growth factors and their receptors whose activities have been best characterized during tendon healing, such as FGFs, PDGF, IGFs, TGF-β and BMP2, did not show enhanced expression in the tECM compared with Col1 group [6]. Taken together, this expression profile suggests that the pro-tenogenic bioactivity of tECM may not involve these growth factor signals, but instead the synergistic contribution from multiple key ECM components.

Conclusions
In summary, in this investigation we have characterized the protein composition of tECM as well as analyzed its pro-tenogenesis bioactivity and compared the transcriptome expression profiles of tECM-and Col1treated hASCs. Our findings showed that urea-extracted tECM retained some collagens (~ 20% w/w) and were significantly enriched in many other lower molecular weight, non-collagenous ECM components. Compared to Col1 treated hASCs, tECM enhanced hASC tenogenic differentiation and exhibited distinct gene expression profiles. Thus, our tECM preparation represents an effective and highly reproducible approach of extracting bioactive ECM components from tendon and is a practically useful method for promoting tenogenesis. Additionally, the findings from this study provide essential clues into the potential mechanism action of tECM on tenogenic differentiation and thus present a rational basis for the application of tECM in tendon healing and regeneration.